Abstract. We introduce a highly structured family of hard satisfiable 
3-SAT formulas corresponding to an ordered spin-glass model from sta- 
tistical physics. This model has provably "glassy" behavior; that is, it 
has many local optima with large energy barriers between them, so that 
local search algorithms get stuck and have difficulty finding the true 
ground state, i.e., the unique satisfying assignment. We test the hardness
of our formulas with two Davis-Putnam solvers, Satz and zChaff,
the recently introduced Survey Propagation (SP), and two local search
algorithms, WalkSAT and Record-to-Record Travel (RRT). We compare our
formulas to random 3-XOR-SAT formulas and to two other generators 
of hard satisfiable instances, the minimum disagreement parity formulas 
of Crawford et al., and Hirsch's hgen2. For the complete solvers the running
time of our formulas grows exponentially in $\sqrt{n}$, and exceeds that
of random 3-XOR-SAT formulas for small problem sizes. SP is unable to 
solve our formulas with as few as 25 variables. For WalkSAT, our formu- 
las appear to be harder than any other known generator of satisfiable 
instances. Finally, our formulas can be solved efficiently by RRT but only 
if the parameter d is tuned to the height of the barriers between local 
minima, and we use this parameter to measure the barrier heights in 
random 3-XOR-SAT formulas as well. 



1 Introduction 

3-SAT, the problem of deciding whether a given CNF formula with three literals 
per clause is satisfiable, is one of the canonical NP-complete problems. Although 
it is believed that it requires exponential time in the worst case, many heuristic 
algorithms have been proposed and some of them seem to be quite efficient on 
average. To test these algorithms, we need families of hard benchmark instances; 
in particular, to test incomplete solvers we need hard but satisfiable instances. 
Several families of such instances have been proposed, including quasigroup
completion [21,15,1] and random problems with one or more "hidden" satisfying
assignments [3,24,2].

In this paper we introduce a new family of hard satisfiable 3-SAT formulas, 
based on a model from statistical physics which is known to have "glassy" be- 
havior. Physically, this means that its energy function has exponentially many 
local minima, i.e., states in which any local change increases the energy, and 
which moreover are separated by energy barriers of increasing height. In terms 
of SAT, the energy is the number of dissatisfied clauses and the global minimum, 
or "ground state," is the unique satisfying assignment. In other words, there are 
exponentially many truth assignments which satisfy all but a few clauses, which 
are separated from each other and from the satisfying assignment by assignments 
which dissatisfy many clauses. Therefore, we expect local search algorithms like 
WalkSAT to get stuck in the local minima, and to have a difficult time finding 
the satisfying assignment. 

We start with a spin-glass model introduced by Newman and Moore [18]
and also studied by Garrahan and Newman [11]. It is like the Ising model,
except that each interaction corresponds to the product of three spins rather 
than two; thus it corresponds to a family of 3-XOR-SAT formulas. Random 3- 
XOR-SAT formulas, which correspond to a similar three-spin interaction on a 
random hypergraph and which are also known to be glassy, have been studied 
by Franz, Mezard, Ricci-Tersenghi, Weigt, and Zecchina [10,22,16], Barthel et
al. and Cocco, Dubois, Mandler, and Monasson j^J. In contrast, the Newman- 
Moore model is defined on a simple periodic lattice, so it has no disorder in its 
topology. 

We test our formulas against five leading SAT solvers: two complete solvers, 
zChaff and Satz, and three incomplete ones, WalkSAT, RRT and the recently 
introduced SP. We compare them with random 3-XOR-SAT formulas, and also 
with two other hard satisfiable generators, the minimum disagreement parity 
formulas of Crawford et al. and Hirsch's hgen2 [12]. For Davis-Putnam solvers,
our formulas are easier than random 3-XOR-SAT formulas of the same density 
in the limit of large size, although they are harder below a certain crossover 
at about 900 variables. For SP, both our formulas and random 3-XOR-SAT 
formulas appear to be impossible to solve beyond very small sizes. For WalkSAT, 
our formulas appear to be harder than any other known generator of satisfiable 
instances. We believe this is because our formulas' lattice structure gives them 
a very high "configurational entropy," i.e., a very large number of local minima, 
in which local search algorithms like WalkSAT get stuck for long periods of time. 

The RRT algorithm solves our formulas efficiently only if the parameter d is 
set to the barrier height between local minima, which for our formulas we know 
exactly to be $\log_2 L + 1$. Although the barrier height in random 3-XOR-SAT
formulas seems to grow more quickly with n than in our glassy formulas, when
$\sqrt{n} = L < 13$ our formulas are harder for RRT than random 3-XOR-SAT formulas
of the same density, even when we use the value of d optimized for each type of 
formula. We propose using RRT to measure barrier heights in other families of 
instances as well. 

2 The model and our formulas 

The Newman-Moore model [18] consists of spins $\sigma_{i,j} = \pm 1$ on a triangular lattice.
Each spin interacts only with its nearest neighbors, and only in groups of three
lying at the vertices of a downward-pointing triangle. If we encode points in
the triangular lattice as $(i,j)$, where the neighbors of each point are $(i \pm 1, j)$,
$(i, j \pm 1)$, and $(i \pm 1, j \mp 1)$, the model's Hamiltonian (energy function) is

$$H = \frac{1}{2} \sum_{i,j} \sigma_{i,j} \, \sigma_{i,j+1} \, \sigma_{i+1,j} .$$

Let us re-define our variables so that they take Boolean values, $s_{i,j} \in \{0,1\}$.
Then, up to a constant, the energy can be re-written

$$H = \sum_{i,j} (s_{i,j} + s_{i,j+1} + s_{i+1,j}) \bmod 2 .$$

In particular, we will focus on the case where the lattice is an $L \times L$ rhombus
with cyclic boundary conditions; then

$$H = \sum_{i,j=0}^{L-1} (s_{i,j} + s_{i,j+1 \bmod L} + s_{i+1 \bmod L,j}) \bmod 2 .$$

Clearly we can think of this as a set of $L^2$ 3-XOR-SAT clauses of the form

$$s_{i,j} \oplus s_{i,j+1 \bmod L} \oplus s_{i+1 \bmod L,j} = 0 ,$$

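in which case H is simply the number of dissatisfied clauses. Each one of these
can then be written as a conjunction of four 3-SAT clauses,

$$\begin{aligned}
& (\bar{s}_{i,j} \vee s_{i,j+1 \bmod L} \vee s_{i+1 \bmod L,j}) \wedge (s_{i,j} \vee \bar{s}_{i,j+1 \bmod L} \vee s_{i+1 \bmod L,j}) \\
\wedge\; & (s_{i,j} \vee s_{i,j+1 \bmod L} \vee \bar{s}_{i+1 \bmod L,j}) \wedge (\bar{s}_{i,j} \vee \bar{s}_{i,j+1 \bmod L} \vee \bar{s}_{i+1 \bmod L,j}) ,
\end{aligned}$$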

producing a 3-SAT formula with $L^2$ variables and $4L^2$ clauses for a total of $12L^2$
literals. There is always at least one satisfying assignment, i.e., where $s_{i,j} = 0$ for
all $i,j$. However, using algebraic arguments [18] one can show that this satisfying
assignment is unique whenever L has no factors of the form $2^m - 1$, and in
particular when L is a power of 2.

To "hide" this assignment, we flip the variables randomly; that is, we choose 
a random assignment $A = (a_{i,j}) \in \{0,1\}^{L^2}$ and define a new formula in terms
of the variables $x_{i,j} = s_{i,j} \oplus a_{i,j}$. While some other schemes for hiding a random
satisfying assignment in a 3-SAT formula create an "attraction" that allows 
simple algorithms to find it quickly, Barthel et al. @] pointed out that for XOR- 
SAT formulas these attractions cancel and make the hidden assignment quite 
difficult to find. (Another approach pursued by Achlioptas, Jia, and Moore is 
to hide two complementary assignments in an NAESAT formula [2].) Of course, 
XOR-SAT is solvable in polynomial time by Gaussian elimination, but Davis- 
Putnam and local search algorithms can still take exponential time on random 
XOR-SAT formulas.
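
As an illustration, the construction of this section can be sketched in Python. This is a minimal sketch under stated assumptions, not the generator used in the experiments: the names `glassy_formula` and `num_unsat`, the row-major variable numbering, and the signed-integer literal encoding (in the style of DIMACS CNF) are choices made only for this example.

```python
import random

def glassy_formula(L, seed=None):
    """Build the hidden-assignment lattice formula as a list of 3-SAT clauses.

    Literals are signed 1-based integers (+v means x_v, -v means NOT x_v);
    this encoding and the row-major numbering are illustrative assumptions.
    """
    rng = random.Random(seed)

    def var(i, j):
        # Variable index of lattice site (i, j) with cyclic boundaries.
        return (i % L) * L + (j % L) + 1

    # Random hidden assignment a_{i,j}, indexed by variable number.
    hidden = {v: rng.randint(0, 1) for v in range(1, L * L + 1)}
    clauses = []
    for i in range(L):
        for j in range(L):
            triple = (var(i, j), var(i, j + 1), var(i + 1, j))
            # s1 ^ s2 ^ s3 = 0 with x = s ^ a becomes x1 ^ x2 ^ x3 = a1 ^ a2 ^ a3.
            parity = sum(hidden[v] for v in triple) % 2
            # One 3-SAT clause per forbidden (wrong-parity) assignment.
            for bits in range(8):
                vals = [(bits >> k) & 1 for k in range(3)]
                if sum(vals) % 2 != parity:
                    clauses.append(
                        tuple(-v if b else v for v, b in zip(triple, vals))
                    )
    return clauses, hidden

def num_unsat(clauses, assignment):
    """Energy H: the number of 3-SAT clauses violated by `assignment`."""
    return sum(
        not any((lit > 0) == bool(assignment[abs(lit)]) for lit in clause)
        for clause in clauses
    )
```

For example, `glassy_formula(8, seed=0)` returns 256 clauses over 64 variables, and `num_unsat` evaluates to 0 on the returned hidden assignment. Flipping any single variable of that assignment dissatisfies the three XOR clauses it appears in, and hence three 3-SAT clauses.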

In general, XOR-SAT formulas have local minima because flipping any vari- 
able will dissatisfy all the currently satisfied clauses it appears in. However, the 
lattice structure of the Newman-Moore model allows us to say much more. In
particular, if we call an unsatisfied XOR-clause a "defect," then if L is a power of
2, there is exactly one state of the lattice for any choice of defect locations [18].
To see this, consider the state shown in Figure 1. Here there is a single defect
(the three cells outlined in black) in which just one XOR-SAT clause (in fact,
just one 3-SAT clause) is dissatisfied. However, since satisfying the XOR-SAT
clause at $(i,j)$ implies that

$$s_{i+1,j} = s_{i,j} \oplus s_{i,j+1} ,$$

the truth values below the defect are given by a mod-2 Pascal's triangle. If L is
a power of 2 the L'th row of this Pascal's triangle consists of all 0's, so wrapping
around the torus matches its first row except for the defect.

This gives a truth assignment which satisfies all but one clause. Moreover, 
this assignment has a large Hamming distance from the satisfying assignment; 
namely, the number of 1's in the Pascal's triangle, which is $H(L) = L^{\log_2 3}$
since it obeys the recurrence $H(2L) = 3H(L)$. It also has a large energy barrier
separating it from the satisfying assignment: to fix the defect with local moves
it is necessary to first introduce $\log_2 L$ additional defects [18].
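
The single-defect state can be checked numerically. The sketch below works with the unhidden variables $s_{i,j}$ and places the defect at clause $(L-1, 0)$; that placement and the function names `single_defect_state` and `defects` are assumptions made for the example. It builds the rows by the recurrence $s_{i+1,j} = s_{i,j} \oplus s_{i,j+1}$ and then lists the violated XOR clauses.

```python
def single_defect_state(L):
    """Mod-2 Pascal's triangle on an L x L torus, seeded by a single 1.

    Rows 1..L-1 are forced by the clauses in rows 0..L-2; only the clauses
    in row L-1, which wrap around to row 0, can remain violated.
    """
    s = [[0] * L for _ in range(L)]
    s[0][0] = 1
    for i in range(L - 1):
        for j in range(L):
            s[i + 1][j] = s[i][j] ^ s[i][(j + 1) % L]
    return s

def defects(s):
    """List the sites (i, j) whose XOR clause is dissatisfied."""
    L = len(s)
    return [
        (i, j)
        for i in range(L)
        for j in range(L)
        if s[i][j] ^ s[i][(j + 1) % L] ^ s[(i + 1) % L][j]
    ]

def hamming_weight(s):
    """Hamming distance from the all-zeros satisfying assignment."""
    return sum(map(sum, s))
```

For L = 8 this yields exactly one defect and Hamming weight 27, in agreement with $L^{\log_2 3}$; for L = 16 the weight is 81.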

Now, by taking linear combinations (mod 2) of single-defect assignments 
we can construct truth assignments with arbitrary sets of defects, and whenever 
these defects form an independent set on the triangular lattice, the corresponding 
state is a local energy minimum. Thus the number of local minima equals the
number of independent sets, which grows exponentially as $\kappa^{L^2}$ where $\kappa \approx 1.395$
is the hard hexagon constant [11,5].

Fig. 1. A local minimum with a single defect. Grey and white cells correspond to
$s_{i,j} = 1$ and $0$ respectively; the XOR-SAT clause corresponding to the three cells
outlined in black is dissatisfied, and all the others are satisfied. The Hamming
distance from the satisfying assignment is the number of grey cells, $L^{\log_2 3} = 27$
since L = 8.

To recap, when $L = 2^k$, there is a unique satisfying assignment. The system is
glassy in that there are many truth assignments which are far from the satisfying 
assignment, but which satisfy all but a small number of clauses. Escaping these 
local minima requires us to first increase the number of unsatisfied clauses by 
roughly $\log_2 L$. Newman and Moore [18] studied the behavior of this model under
simulated annealing, and found that the system is unable to find its ground state 
unless the cooling rate is exponentially slow; similarly, we expect the running 
time of local search algorithms like WalkSAT to be exponentially large. 

Below, we compare our formulas to random satisfiable 3-XOR-SAT formulas, 
which were proposed in [22] (and also in 0] as the special case $p_0 = 1/4$).
These are formed with a random hidden assignment in the following way: given
variables $x_1, \ldots, x_n$, select a random truth assignment $A \in \{0,1\}^n$. Then, m
times, select a triple $x_i, x_j, x_k$ uniformly without replacement, and add the
3-XOR clause consistent with A, i.e., the clause that fixes the parity
$x_i \oplus x_j \oplus x_k$ to the value it takes under A. To compare with
our formulas, we set $n = m = L^2$ so the resulting 3-XOR-SAT formula has a
density of one clause per variable.
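
The random generator just described can be sketched as follows. This is an illustrative sketch only: the name `random_3xorsat` and the clause representation (a triple of 0-based variable indices plus a right-hand-side bit) are assumptions, and "without replacement" is read here as meaning that the three variables in each triple are distinct, while the same triple may be drawn more than once.

```python
import random

def random_3xorsat(n, m, seed=None):
    """Random 3-XOR-SAT formula with a hidden satisfying assignment.

    Returns (clauses, hidden), where each clause is ((i, j, k), rhs) and
    encodes the constraint x_i ^ x_j ^ x_k == rhs.
    """
    rng = random.Random(seed)
    hidden = [rng.randint(0, 1) for _ in range(n)]
    clauses = []
    for _ in range(m):
        # Three distinct variables, chosen uniformly at random.
        i, j, k = rng.sample(range(n), 3)
        # The right-hand side is the parity under the hidden assignment,
        # so the hidden assignment satisfies every clause by construction.
        rhs = hidden[i] ^ hidden[j] ^ hidden[k]
        clauses.append(((i, j, k), rhs))
    return clauses, hidden
```

Calling `random_3xorsat(L * L, L * L)` gives a formula with the same number of variables and XOR clauses as the lattice formula of size L.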

3 Experimental results 

3.1 Davis-Putnam solvers: zChaff and Satz 

We obtained zChaff from the Princeton web site and Satz from the SATLIB
web site ^3]. Figure 2 shows a log-log plot of the median number of decisions or
branches that zChaff and Satz took as a function of the lattice size L. For both
algorithms the slope for our glassy formulas is roughly 1, indicating that the
running time for zChaff and Satz to solve our formulas grows as $2^L = 2^{\sqrt{n}}$. The
reason for this is that, due to a process similar to bootstrap percolation [13],
when a sufficient number of variables are set by the algorithm (for instance,
the variables in a single row) the remainder of the variables in the lattice are
determined by unit propagation. For random 3-XOR-SAT formulas, the running
time is exponential in $n = L^2$, but with a smaller constant, so that for $L < 30$
(i.e., $n < 900$) our formulas are harder than random 3-XOR-SAT formulas of
the same size.
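
The propagation argument can be illustrated directly on the XOR form of the formula. In the sketch below (hypothetical names `clause_parities` and `propagate_from_row`), fixing the L variables of row 0 determines every other row, because each XOR clause with two known variables forces the third; in the 3-SAT encoding this is exactly what unit propagation does. This is an illustration of the counting argument, not a model of how zChaff or Satz actually branch.

```python
def clause_parities(a):
    """Right-hand sides of the hidden formula: parity of each triple under a."""
    L = len(a)
    return [
        [a[i][j] ^ a[i][(j + 1) % L] ^ a[(i + 1) % L][j] for j in range(L)]
        for i in range(L)
    ]

def propagate_from_row(first_row, parity):
    """Fill in the whole lattice from the values of row 0.

    Clause (i, j) reads x[i][j] ^ x[i][j+1] ^ x[i+1][j] == parity[i][j],
    so once row i is known, row i+1 is forced.
    """
    L = len(first_row)
    x = [list(first_row)]
    for i in range(L - 1):
        x.append(
            [parity[i][j] ^ x[i][j] ^ x[i][(j + 1) % L] for j in range(L)]
        )
    return x
```

Given the correct first row of the hidden assignment, `propagate_from_row` recovers the entire assignment, so at most $2^L$ settings of a single row need to be explored.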

3.2 SP 

SP is an incomplete solver recently introduced by Mezard and Zecchina ^2] based 
on a generalization of belief propagation called survey propagation. For random 
3-SAT formulas it is extremely successful; it can find a satisfying assignment
efficiently for random 3-SAT formulas up to size $n = 10^7$ near the satisfiability
threshold $m/n \approx 4.25$ where random 3-SAT appears to be hardest.

We found that SP cannot solve our formulas for $L \geq 5$, i.e., with as few as n = 25
variables. The cavity biases continue to change, and never converge to a fixed
point, so no variables are ever set by the decimation process. There are several
possible reasons for this. One is the large number of local minima; another is that
the symmetry in XOR clauses may produce conflicting messages; another is that
our formulas have small loops which violate SP's assumption that the formula
is locally treelike and that neighbors are statistically independent. (Random
3-XOR-SAT formulas are also quite hard for SP, although we found that SP solved
about 25% of them with n = m = 25.)

Fig. 2. The number of branches made by zChaff and Satz on our formulas and
on random 3-XOR-SAT formulas of the same size and density, as a function of
the lattice size L. The running time for random 3-XOR-SAT is exponential in
$L^2 = n$, while for our formulas it is exponential in $L = \sqrt{n}$. Nevertheless, for
small values of n our formulas are harder. Each point is the median of 25 trials;
for our formulas, only values of L for which the satisfying assignment is unique
are shown.

3.3 Local algorithms: WalkSAT 







Fig. 3. The median number of flips made by WalkSAT on our formulas and 
random 3-XOR-SAT formulas of the same size. For our formulas, only values 
of L for which the satisfying assignment is unique are shown. Each point is the 
median of 25 trials. 

WalkSAT [20] is an algorithm which combines a random walk search strategy
with a greedy bias towards assignments with more satisfied clauses. WalkSAT
has been shown to be highly effective on a range of problems, such as hard
random k-SAT problems, graph coloring, and the circuit synthesis problem. We
performed trials of up to $10^9$ flips for each formula, without random restarts,
where each step does a random or greedy flip with equal probability. Figure 3
shows a semi-log plot of the median number of flips as a function of $n = L^2$. We
used only four values of L, namely 5, 8, 10 and 11, because WalkSAT
was unable to solve the majority of formulas with larger values of L (for which
the satisfying assignment is unique) within $10^9$ flips.
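
For concreteness, a simplified WalkSAT-style loop with equal probability of random and greedy moves might look as follows. This is a sketch under stated assumptions rather than the WalkSAT implementation used in the experiments: the greedy step here picks the variable in the chosen clause that minimizes the total number of unsatisfied clauses, counts are recomputed from scratch instead of being maintained incrementally, and the function name and signature are hypothetical.

```python
import random

def walksat(clauses, n_vars, max_flips=100000, p=0.5, seed=None):
    """Simplified WalkSAT on clauses given as tuples of signed 1-based ints.

    Returns (assignment, flips) on success or (None, max_flips) on failure.
    """
    rng = random.Random(seed)
    # assign[v] is the truth value of variable v; index 0 is unused.
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def unsat_after_flip(v):
        # Number of unsatisfied clauses if v were flipped (naive recount).
        assign[v] = not assign[v]
        count = sum(not satisfied(c) for c in clauses)
        assign[v] = not assign[v]
        return count

    for flips in range(max_flips + 1):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign[1:], flips
        if flips == max_flips:
            break
        clause = rng.choice(unsat)
        if rng.random() < p:
            v = abs(rng.choice(clause))  # random-walk move
        else:
            v = min((abs(lit) for lit in clause), key=unsat_after_flip)  # greedy move
        assign[v] = not assign[v]
    return None, max_flips
```

The naive recount makes each flip cost time proportional to the formula size, which is acceptable for small lattices but far too slow for runs of $10^9$ flips.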

For both our formulas and random 3-XOR-SAT formulas, the median running 
time of WalkSAT grows exponentially in n. However, the slope of the exponential 
is considerably larger for our formulas, making them much harder than the 
random ones. We believe this is due to a larger number of local minima. 

3.4 Local algorithms: RRT 

RRT [9,19] is a variant of WalkSAT which works as follows:

1. Start from a random truth assignment; 

2. Randomly choose a variable from an unsatisfied clause; 

3. Flip it if this flip leads to a configuration that has at most d more unsatisfied 
clauses than the best configuration found so far (the "record"); don't flip 
otherwise; 

4. Repeat steps 2 and 3 until the satisfying truth assignment is found.
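
The four steps above can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the function name `rrt`, the flip limit, and the reading of step 2 as "pick a uniformly random unsatisfied clause, then a uniformly random variable in it" are assumptions, and the number of unsatisfied clauses is recomputed naively after each trial flip.

```python
import random

def rrt(clauses, n_vars, d, max_flips=1000000, seed=None):
    """Record-to-Record Travel on clauses given as tuples of signed 1-based ints.

    Returns (assignment, flips) on success or (None, max_flips) on failure.
    """
    rng = random.Random(seed)
    # Step 1: random truth assignment (index 0 is unused).
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def energy():
        return sum(not satisfied(c) for c in clauses)

    current = energy()
    record = current  # fewest unsatisfied clauses seen so far
    for flips in range(max_flips):
        if current == 0:
            return assign[1:], flips
        # Step 2: random variable from a random unsatisfied clause.
        unsat = [c for c in clauses if not satisfied(c)]
        v = abs(rng.choice(rng.choice(unsat)))
        # Step 3: keep the flip only if the energy stays within d of the record.
        assign[v] = not assign[v]
        new = energy()
        if new <= record + d:
            current = new
            record = min(record, new)
        else:
            assign[v] = not assign[v]  # reject: undo the flip
    if current == 0:
        return assign[1:], max_flips
    return None, max_flips
```

Note that d is passed in once and never changes during the run, matching the description of d as a predefined constant.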

Notice that d is a predefined constant and will not be changed during the 
RRT process. In order to solve a formula, we have to find the "right" d for RRT. If 
we choose d to be too small, RRT fails because it cannot escape the local minima; 
and if we choose d to be too big, it escapes the local minima but takes a long 
time to find the solution since it is not greedy enough to move toward it. 

We tested RRT on our formulas with L = 4, 5, 8, 10, 11, 13 and 16 and we
performed trials of up to $10^7$ flips for each formula. Newman and Moore [18]
showed that the largest barrier height is $\log_2 L + 1$. In fact, it turns out that RRT
solved our formulas efficiently only when $d = \log_2 L + 1$ (see Figure 4). With
L = 16 and d = 5, RRT solved our formulas in all of 50 trials with a median
number of flips $1.10 \times 10^6$; but when we set d = 4 or 6, RRT could not solve any
of the formulas with L = 16 within $10^7$ flips. RRT finds our glassy formulas
"easy" only if it knows the "right" value of d.

We also tested RRT on random 3-XOR-SAT formulas with $n = m = L^2$
ranging from 16 to 256, so that the resulting 3-XOR-SAT formulas have the same
density as our glassy formulas. Since we don't know the barrier height between local
minima in these formulas, we tried RRT with different values of d to find the
"right" d for each value of n. As a rough measurement of the barrier heights,
we measured the value v for which RRT solved more than half the formulas with
d = v but failed to solve at least half of them with $d = v - 1$. We set the maximum
running time to $10^7$ flips.
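
The threshold rule for v can be written down explicitly. In the sketch below, `solved_fraction` is a hypothetical mapping from each tried value of d to the fraction of formulas RRT solved within the flip limit; treating an untried d = v - 1 as "none solved" is an assumption of this example.

```python
def estimate_barrier(solved_fraction):
    """Smallest v with more than half solved at d = v and at most half at d = v - 1.

    Returns None if no tried value of d meets the criterion.
    """
    for v in sorted(solved_fraction):
        below = solved_fraction.get(v - 1, 0.0)  # untried d counts as unsolved
        if solved_fraction[v] > 0.5 and below <= 0.5:
            return v
    return None
```

The success fractions themselves would come from repeated RRT runs at each d on a batch of formulas of the same size.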

Figure 4 shows the "right" value of d and the running time for each value of
n. We see that the barrier height in random 3-XOR-SAT formulas seems to grow
more quickly with n than in our glassy formulas. However, when $\sqrt{n} = L < 13$,
our formulas are harder for RRT than random 3-XOR-SAT formulas of the same 
density, even when we use the value of d optimized for each type of formula. 

We find it interesting that RRT can be used to measure the barrier heights 
between local minima, and we propose to do this for other families of formulas 
as well. 




Fig. 4. The "right" value of d and the running time for our formulas and random 
3-XOR-SAT formulas of the same size. Shown is the number of flips per variable. 

3.5 Comparison with other hard SAT formulas 

To further demonstrate the hardness of our glassy formulas, we compare them
to two other generators of hard instances: the parity formulas introduced by
Crawford et al. [7] and the hgen2 formulas introduced by E.A. Hirsch [12]. The
parity formulas of [7] are translated from minimal disagreement parity problems
and are considered very hard. While hgen2 does not generate parity formulas,
we include it because it produced the winner of the SAT 2003 competition for
the hardest satisfiable formula [23].

We compared our glassy formulas with 10 formulas of Crawford et al., obtained
from 0, and with 25 hgen2 formulas using the generator obtained from [12].
We ran zChaff, Satz, WalkSAT and SP; we did not test RRT on these formulas.

For WalkSAT, we ran 25 trials of up to $10^9$ flips each, and labeled the formula
"not solved" if none of these trials succeeded. Comparing our glassy formulas
with those of Crawford et al., taking similar numbers of variables and literals
(e.g. comparing our L = 16 formulas, which have 256 variables and 3072 literals,
with theirs with roughly 300 variables and 4000 literals) we see from Table 1
that our formulas are significantly harder than theirs for zChaff, Satz, and
WalkSAT. (SP didn't solve any of these formulas, so it doesn't provide a basis for
comparison.) Compared to hgen2 formulas with 195 variables and 3096 literals,
our formulas are not as hard for the complete solvers, but appear to be harder
for WalkSAT, again perhaps due to their large number of local minima.



Formulas        Literals  Variables  Dec. (zChaff)  Bran. (Satz)  Flips (WalkSAT)
par8-1-c.cnf         732         64             17             3             1494
par8-2-c.cnf         780         68              9             1             2371
par8-3-c.cnf         864         75             18             4             5638
par8-4-c.cnf         768         67              7             1             2811
par8-5-c.cnf         864         75             12             3             4828
par16-1-c.cnf       3670        317           2073          1591       2.5 x 10^8
par16-2-c.cnf       4054        349          11117           499       1.3 x 10^8
par16-3-c.cnf       3874        334           7505          1489       1.0 x 10^8
par16-4-c.cnf       3754        324           2181          4415       1.4 x 10^8
par16-5-c.cnf       3958        341           2758          1296       4.1 x 10^8
Glassy 8x8           768         64            167            50           219455
Glassy 16x16        3072        256          39293         32219       not solved
Random XOR           768         64             23             3             9167
Random XOR          3072        256           1427           198       3.9 x 10^8
hgen2               3096        295     not solved       1478340           751723



Table 1. Comparison of our glassy lattice formulas with the parity formulas of 
Crawford et al., Hirsch's hgen2, and random 3-XOR-SAT formulas. 

4 Conclusion 

We have introduced a new generator of hard satisfiable SAT formulas derived 
from a two-dimensional spin-glass model. We tested our formulas against five 
leading SAT solvers, and compared them with random 3-XOR-SAT formulas, 
the minimal disagreement parity formulas of Crawford et al., and Hirsch's hgen2 
generator. For complete solvers, the running time of our formulas grows exponentially
only in $L = \sqrt{n}$, but they are harder than random 3-XOR-SAT formulas
when n is small. For SP our formulas appear to be impossible for $n \geq 25$ variables.
For WalkSAT our formulas appear to be harder than any other known generator 
of satisfiable instances. Finally, the RRT algorithm solves our formulas only if 
d is set to the barrier height between local minima, which we know exactly to 
be $\log_2 L + 1$. We propose that RRT can be used to measure the barrier heights
between local minima in other families of instances, and we have done this for 
random 3-XOR-SAT formulas. 

Since XOR-SAT is solvable in polynomial time, it would be interesting to 
have a provably glassy set of formulas which would be NP-complete to solve. 
One approach would be a construction along the lines of [7], where "noise" is 
introduced to the underlying parity problem so that it is no longer polynomial- 
time solvable. 

Finally, we feel that the highly structured nature of our formulas, which 
makes it possible to prove the existence of exponentially many local optima 
with large barriers between them, suggests an interesting direction for future 
work. For instance, are there families of formulas based on spin-glass models in 
three or more dimensions which would be even harder to solve? 